Semantic Dictionary Encoding
falvotech.com · 2h
🌀Brotli Dictionary
Ancient Scripts, Modern AI: Bridging the Divide with Morphology-Aware Tokenization by Arvind Sundararajan
dev.to · 1d
๐Ÿ“Concrete Syntax
UTF-8 Is Beautiful
hackaday.com · 11h
🔣Unicode
OTW - Bandit Level 4 to Level 5
tbhaxor.com · 11h
🔧KAITAI
Weasel words and co.: Guide to recognising AI-generated texts on Wikipedia
heise.de · 32m
🖋Typography
Preserving the digital legacy of company archives: Last stop, Newhaven.
dpconline.org · 8h
💾Data Preservation
Challenges You Will Face When Parsing PDFs with Python
theseattledataguy.com · 1h
📄PDF Archaeology
Vibe Graveyard
vibegraveyard.ai · 6h
📼Cassette Culture
Lessons from using AI in Discovery
thoughtbot.com · 16h
🕵️Metadata Mining
A Kevin week
blog.mitrichev.ch · 19h
Linear Algebra
Show HN: Semlib – Semantic Data Processing
github.com · 2h
🌳Incremental Parsing
Learn How to Use Transformers with HuggingFace and SpaCy
towardsdatascience.com · 3h
🎯Dependent Parsing
IETF Draft: Authenticated Transfer Repo and Sync Specification
ietf.org · 5h
🌳Archive Merkle Trees
Valuable News – 2025/09/15
vermaden.wordpress.com · 8h
🔌Operating system internals
WorldCat Editions and Holdings Release
annas-archive.org · 1d
📚MARC Records
Language Models Pack Billions of Concepts into 12,000 Dimensions
nickyoder.com · 12h
🧮Kolmogorov Complexity
Sindhi Halchal Archive: Building on the PG Sindhi Library
digitalorientalist.com · 3d
Web Archiving
ISO C++ committee has a new chief sheep herder
shape-of-code.com · 18h
📜Proof Carrying Code
How to Train an LLM-Recommender Hybrid that Speaks English & Item IDs
eugeneyan.com · 1d
Information Retrieval